Multiple Time Resolutions for Derivatives of Mel-frequency Cepstral Coefficients

Authors

Georg Stemmer

Christian Hacker

Elmar Nöth

Heinrich Niemann

Abstract

Most speech recognition systems are based on mel-frequency cepstral coefficients and their first- and second-order derivatives. The derivatives are normally approximated by fitting a linear regression line to a fixed-length segment of consecutive frames. The time resolution and smoothness of the estimated derivative depend on the length of the segment. We present an approach to improve the representation of speech dynamics, which is based on the combination of multiple time resolutions. The resulting feature vector is transformed to reduce its dimension and the correlation between the features. Another possibility, which has also been evaluated, is to use probabilistic PCA (PPCA) for the output distributions of the HMMs. Different configurations of multiple time resolutions are evaluated as well. When compared to the baseline system, a significant reduction of the word error rate can be achieved.
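The regression-based derivative and the stacking of several time resolutions described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the half-widths (1, 2, 4), the edge padding, and the use of plain PCA as the dimension-reducing, decorrelating transform are all assumptions, since the abstract does not specify them.

```python
import numpy as np

def regression_delta(features, half_width):
    """Slope of the least-squares line fitted to the 2*half_width+1 frames
    centred on each frame. features has shape (num_frames, num_coeffs)."""
    num_frames = features.shape[0]
    # Repeat the first/last frame so every frame has a full window (assumption).
    padded = np.pad(features, ((half_width, half_width), (0, 0)), mode="edge")
    offsets = np.arange(1, half_width + 1)
    denom = 2.0 * np.sum(offsets ** 2)
    delta = np.zeros(features.shape, dtype=float)
    for k in offsets:
        ahead = padded[half_width + k : half_width + k + num_frames]
        behind = padded[half_width - k : half_width - k + num_frames]
        delta += k * (ahead - behind)
    return delta / denom

def multi_resolution_deltas(features, half_widths=(1, 2, 4)):
    """Stack the static features with one delta stream per window length.
    The half-widths are hypothetical values chosen for illustration."""
    streams = [features] + [regression_delta(features, k) for k in half_widths]
    return np.hstack(streams)

def pca_reduce(vectors, num_components):
    """Project onto the leading principal components. PCA stands in here for
    the unspecified transform that reduces dimension and correlation."""
    centered = vectors - vectors.mean(axis=0)
    _, _, components = np.linalg.svd(centered, full_matrices=False)
    return centered @ components[:num_components].T

# Usage: 200 frames of 13 synthetic "cepstral" coefficients.
rng = np.random.default_rng(0)
frames = rng.normal(size=(200, 13))
stacked = multi_resolution_deltas(frames)   # shape (200, 52)
reduced = pca_reduce(stacked, 24)           # shape (200, 24)
```

A short window follows rapid changes but gives a noisier slope, while a long window gives a smoother slope at coarser time resolution. Stacking both lets the later transform decide which combination to keep.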


Similar resources

Noise-Robust Speech Features Based on Cepstral Time Coefficients

In this paper, we investigate the noise-robustness of features based on the cepstral time coefficients (CTC). By cepstral time coefficients, we mean the coefficients obtained from applying the discrete cosine transform to the commonly used mel-frequency cepstral coefficients (MFCC). Furthermore, we apply temporal filters used for computing delta and acceleration dynamic features to the CTC, res...


Acoustic Emotion Recognition Using Linear and Nonlinear Cepstral Coefficients

Recognizing human emotions through the vocal channel has gained increased attention recently. In this paper, we study how the features and classifiers used affect the recognition accuracy of emotions present in speech. Four emotional states are considered for classification of emotions from speech in this work. For this aim, features are extracted from audio characteristics of emotional speech using Linea...


Real Time Speech Recognition Using DSK TMS320C6713

Speech recognition is an important field of digital signal processing. The objective of Automatic Speaker Recognition (ASR) is to extract features and to characterize and recognize the speaker. Mel Frequency Cepstral Coefficients (MFCC) are the most widely used feature vector for ASR. MFCC is used for designing a text-dependent speaker identification system. In this paper the DSP processor TMS320C6713 with Code Comp...


What is in the Dynamic Features: Analysis of the Derivatives of Log-Mel-Spectra

The present investigation analyses the behaviour of the first order derivatives of the log-mel-spectrum of vowels which constitutes the basis for the mel-frequency cepstral coefficients (MFCC). The results indicate that the dynamic features when inspected at log-mel-spectra level seem to be less influenced by speaker specific characteristics and degrade less in fast speech. However, when analys...


Mel Frequency Cepstral Coefficients Based Pattern Recognition for Limb Motor Action

This paper proposes a Mel Frequency Cepstral Coefficient (MFCC) based hybrid algorithm for motor imagery classification of Electroencephalogram (EEG) signals for Brain Computer Interface (BCI). The proposed hybrid algorithm contains MFCC with Hjorth Parameter. A regression coefficient method was used for eye artifact cancellation. The feature extraction method based on the difference of the diffe...



Publication date: 2001
